The Construction of Bilingual Knowledge Bank based on the Synchronous SSTC Annotation Schema
Abstract
In this paper, we present an approach to constructing a large Bilingual Knowledge Bank (BKB) from a given bilingual corpus, based on the idea of synchronous Structured String-Tree Correspondence (SSTC). The SSTC is a general structure that can associate an arbitrary tree structure with a string in a language, as chosen by the annotator to be the interpretation structure of that string. More importantly, it provides the facility to specify the correspondence between the string and the associated tree, which can be non-projective. With this structure, we are able to match linguistic units at different levels within the structure (i.e. define the correspondence between substrings in the sentence, nodes in the tree, subtrees in the tree and sub-correspondences in the SSTC). This flexibility makes synchronous SSTC well suited to constructing the Bilingual Knowledge Bank we need for the English-Malay MT application.
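The string-tree correspondence described in the abstract can be pictured with a small sketch. The following Python is illustrative only and is not taken from the paper: the names `Node`, `snode`, `stree`, `words_for` and `is_discontinuous`, the half-open word-position intervals, and the English-Malay sentence pair are all assumptions chosen for the example. Each tree node records the words it corresponds to directly and the words covered by its subtree; a node such as "picks up" maps to two separated intervals, which is the kind of non-projective correspondence the abstract mentions. A synchronous pairing is then shown as a plain list of aligned interval sets.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open word positions (start, end); an assumed encoding


@dataclass
class Node:
    """One tree node with its correspondence to the string (hypothetical layout)."""
    label: str
    snode: List[Interval]  # words the node itself corresponds to
    stree: List[Interval]  # words covered by the subtree rooted at this node
    children: List["Node"] = field(default_factory=list)


def words_for(intervals: List[Interval], tokens: List[str]) -> List[str]:
    """Return the tokens selected by a list of intervals, in interval order."""
    out: List[str] = []
    for start, end in intervals:
        out.extend(tokens[start:end])
    return out


def is_discontinuous(intervals: List[Interval]) -> bool:
    """True if there is a gap between consecutive intervals."""
    ordered = sorted(intervals)
    return any(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


# Illustrative English sentence and one possible dependency-style tree.
tokens = "He picks the ball up".split()
the = Node("the", snode=[(2, 3)], stree=[(2, 3)])
ball = Node("ball", snode=[(3, 4)], stree=[(2, 4)], children=[the])
he = Node("He", snode=[(0, 1)], stree=[(0, 1)])
root = Node("picks up", snode=[(1, 2), (4, 5)], stree=[(0, 5)], children=[he, ball])

# Illustrative Malay counterpart and a synchronous pairing expressed as
# (source intervals, target intervals); both are assumptions for this sketch.
target_tokens = "Dia kutip bola itu".split()
pairs = [
    ([(0, 1)], [(0, 1)]),          # He        <-> Dia
    ([(1, 2), (4, 5)], [(1, 2)]),  # picks up  <-> kutip
    ([(2, 4)], [(2, 4)]),          # the ball  <-> bola itu
]
aligned = [
    (words_for(src, tokens), words_for(tgt, target_tokens)) for src, tgt in pairs
]

if __name__ == "__main__":
    print(words_for(root.snode, tokens), is_discontinuous(root.snode))
    for src_words, tgt_words in aligned:
        print(src_words, "<->", tgt_words)
```

Storing intervals on the nodes, instead of assuming each subtree spans a contiguous substring, is what lets the same structure express correspondences at the substring, node and subtree levels.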
Similar resources
A Synchronization Structure Of SSTC And Its Applications In Machine Translation
In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. In order to describe the correspondence between different languages, we propose a variant of SSTC called synchronous SSTC (S-SSTC). We will also describe how S-SSTC provides the flexibility to treat some of the non-standard cases, which are problematic to other synchronous formalisms. The proposed S-SSTC schema is well sui...
Synchronous Structured String-Tree Correspondence (S-SSTC)
In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. We propose a variant of SSTC called synchronous SSTC. Synchronous SSTC can be used to describe the correspondence between different languages. We will also describe how synchronous SSTC provides the flexibility to treat some of the non-standard cases, which are problematic to other syn...
Example-Based Machine Translation Based on the Synchronous SSTC Annotation Schema
In this paper, we describe an Example-Based Machine Translation (EBMT) system for English-Malay translation. Our approach is an example-based approach which relies solely on example translations kept in a Bilingual Knowledge Bank (BKB). In our approach, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is used to annotate both the source and target sentences of a tr...
Converting a Bilingual Dictionary into a Bilingual Knowledge Bank based on the Synchronous SSTC
In this paper, we present an approach to constructing a large Bilingual Knowledge Bank (BKB) from an English-Malay bilingual dictionary, based on the idea of synchronous Structured String-Tree Correspondence (SSTC). The SSTC is a general structure that can associate an arbitrary tree structure with a string in a language, as chosen by the annotator to be the interpretation structure of the...
A Flexible Example-based Parser Based on the SSTC
In this paper we sketch an approach to natural language parsing. Our approach is example-based: it relies mainly on examples that have already been parsed into their representation structure, and on the knowledge that these examples supply the information required to parse a new input sentence. In our approach, examples are annotated with the Structured String Tree Correspo...